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I.   Introduction 

The  need  for  readily  computed  parameter  estimates  is  great. 
Maximum  likelihood  estimators  are  known  to  be  asymptotically  efficient,  but 
often  they  are  very  hard  to  find.   The  most  popular  alternative  is  the  method 
of  moments  which  usually  yields  readily  computed  estimates,  but  sometimes 
these  estimates  are  not  very  efficient.    This  report  looks  at  the  efficiency 
of  the  method  of  moments  and  of  some  similarly  constructed  'quasi-method 
of  moment 'estimators. 

The  basic  idea  is  to  select  a  system  of  estimating  equations  which 
equates  various  statistics  to  their  expected  values.   The  method  of  moments 
does  this  for  sample  moments  of  order  one,  two,  etc.   We  propose  to  consider 
also  moments  of  order  zero,  minus  one,  and  perhaps  other  functions.   The 
examination  of  the  consequent  efficiencies  may  aid  in  the  building  of 
intuition  so  that  a  wiser  selection  of  estimating  functions  can  be  made  in 
new  situations  when  they  appear. 

Moments  of  order  zero  and  minus  one  require  positive  data.   The  former 

is  the  geometric  mean  and  the  latter  is  the  harmonic  mean.   They  form  part  of 

a  general  family,  f(r)  =  [—  V.  x.l    ,  of  nondecreasing  means  (where  x. , . . . ,x 

n  Ll     x  In 

forms  the  data).  When  dealing  with  the  harmonic  mean,  we  will  be  setting 

1  r  1  -1 

—  I —  (=  [f(-l)]   )  equal  to   E(l/X)   because,  for  the  parent  populations 

i 

treated  here,  the  latter  value  is  easy  to  obtain.   Similarly  when  dealing 

with  the  geometric  mean,  we  will  be  setting  —  )1  In  x.  (=  In  f(0))   equal  to 

n  u±.  l 

E(ln  X) .   In  this  form,  the  geometric  mean  appears  in  many  maximum  likeli- 
hood systems.   This  suggests  that  alternative  quantities,  that  are  close  to 
means  with   r  =  0,  might  be  profitably  exploited.   The  results  give  some 
support  to  this  idea. 

The  structure  of  the  efficiency  computations  utilizes  the  theoretical 

work  presented  in  a  companion  report  [12].   The  pertinent  material  is 

1 


summarized  in  Section  II  of  the  present  report.   Section  III  contains 
applications  of  the  idea  to  two,  one-parameter  parent  populations,  Poisson 
and  symmetric  beta.   Some  speculations  for  interpreting  these  results  are 
made.   Section  IV  is  devoted  to  two  parameter  settings;  gamma,  negative 
binomial,  and  beta. 

Much  computational  work  is  necessary  to  support  this  development  and 
the  details  are  relegated  to  the  appendices.   Appendix  A  contains  general 
relationships  among  the  required  population  moments.   Specialization  to 
Poisson,  gamma,  negative  binomial,  and  beta  is  contained  in  Appendices 
B,  C,  D,  E  (resp.).   Computations  are  performed  by  APL  programs  and  these 
are  included  too,  as  Appendix  F. 


II.   Methodology 

It  is  assumed  that  the  estimating  equations  have  the  form 

g(x,6)  =  0  (2.1) 

where   x  =  (x, ,...,x  )   is  the  data  vector  of  a  random  sample  from  the  specif ie 
In 

parent  population,  6'  =  (0  ,...,8  )   is  the  parameter  point  which  belongs  to 

an  open  subset  of  p-dimensional  space,  and   g1  =  (g. , . . . ,g  ).   Primes  denote 

transpose.   In  order  for  the  system  (2.1)  to  have  a  unique  solution  9,  it 

is  necessary  (by  the  Implicit  Function  Theorem  [16])  that  the  Jacob ian 
%Sy  •  ••  »g 


J  = 


dV"°p 


f   0. 


The  following  structural  assumptions  are  made: 

(i)   Each   g.(x,  9)   for   j  =  l,...,p   is  a  symmetric  function  of   x,  i.e, 

is  invariant  under  permutations  of   x, , . . . ,x  . 

r  In 


(ii)   E{g  (X,9)}  =0   for   j  =  l,...,p. 
(iii)   Var{g .(X,6)}  -  ~  C  (9)  +  o(^)   for  j=l,...,p. 
(iv)   The  g.(x,0)   have  bounded  continuous  partial  derivatives  with 
respect  to  0  ,  .  . .  ,9    for   j  =  l,...,p. 
(v)   9,  which  is  the  solution  of  (2.1)  is  consistent  and  asymptotically 
normal . 


Assumption  (i)  is  modest  and  expected  of  any  reasonable  estimating 
scheme.   The  meeting  of  assumptions  (ii)  and  (iii)  is  a  matter  of  scaling 
and  arrangement.   Assumption  (iv)  is  needed  so  that  the  asymptotic  covariance 
matrix  is  well  behaved.   Assumption  (v)  is  always  desirable  and  convenient 
since  it  implies  that  the  estimators  are  asymptotically  unbiased  and  the 
ellipsoid  of  concentration  [4]  is  characterized  in  terms  of  the  inverse  of 
the  asymptotic  covariance  matrix.   Efficiency  computations  are  based  on  its 
determinant. 

Equipment  for  verifying  (v)  is  contained  in  [9].   There,  the  functions 

1  vtl 
g.   are  averages  of  the  form  —  £._,  g.(x.,9)   for   j  =  l,...,p   and  this 

structure  is  consonant  with  the  present  set  of  assumptions.   All  of  the 

cases  treated  in  this  report  have  this  structure. 

Much  license  is  taken  in  what  follows.   The  purist  is  referred  to  [9]. 

Let  A(x,9)   be  the  p  by  p  matrix  of  partial  derivatives  {3g./30,}. 

J    ^ 

Assume  that  the  elements  behave  as  in  (2.2) 


A..  (X,9)  ->  E{3g./39.  }  (2.2) 


as   n  -*  °°.   The  resulting  limiting  matrix  is  denoted  by  A  or  A(S) 
The  assumptions  allow  the  first  order  expansion 


g(x,e)  =  g(x,e)  +  A(x,e  +  p(e-0))(e-e)       (2.3) 

where   p  is  a  diagonal  matrix  of  random  numbers  belonging  to  the  interval 
[0,1].   Since  the  system  is  soluble,  g(x,9)  =  0  and  we  can  write 

(e-e)  =  -a_1(x,  e  +  p(e-e)  g(x,e))  (2.4) 

since  the  continuity  of  A  implies  that  of  A    and  of   g.   Letting  the 
asymptotic  covariance  matrices,  as   n  -»■  °°,  be  defined  by 

M  =  limit  nE(9-6)(0-e) '  (2.5) 

C  =  limit  nE{g(X,9)  g'(X,9)}  (2.6) 

it  follows  from  (2.2)  and  (2.4)  that 

M  =  A_1  C(A')-1  (2.7) 

The  method  of  maximum  likelihood  fits  into  this  setting.   The  parent 
population  has  density   f(x,9),  and  (2.1)  takes  the  form 

n   8  Jin  f(x.,0) 

nSr  °n  J,     98   *    =0       f«rr-l.....p    (2.8) 

1=1      r 

Assumption  (ii)  requires  the  regularity  conditions  [10]  as  does  the  relation- 
ship 

x         r   k    ' 
where   A  is  the  information  matrix.   Using  (2.2)  and  (2.6)  it  is  seen  that 
both  A  and   C  are  equal  to  A   and  hence 


M  =  A  X  (2.10) 


follows  from  (2.7) 


Now  suppose  that  only  the  first   q   likelihood  equations  (2.8)  are 
used  in  the  system  g  =  0.   Let  us  denote  this  subset  by  p  =  0,  and  the 
remaining  p-q   equations  by  h  =  0.   Thus  in  partitioned  form  (2.1)  becomes 

g  =  O    =  0  (2.11) 

h 

Proceeding  formally,  (2.6)  becomes 

C  =  limit  n  j  /  *  <  (2.12) 

(  E(h'u)      E(hh')  )     (  C21     C22  ) 

The  information  matrix  can  be  partitioned  likewise 

(  All   A12  | 
A  =  <         X  J  (2.13) 

(  An   A22 
Further,  define  a   p-q   by   q  matrix 


g21  -  {E(3h./39k)},        j  =  l,...,p-q;  k  =  l,...,q     (2.14) 

and  a  p-q   order  square  matrix 

g22  =  {E(3hj/39k)},        j  =  l,...,p-q;  k  =  q+l,...,p   (2.15) 

It  is  shown  in  [12,  Sec.  IV],  that 


C  =  j  (2.16) 


where  C„9   is  as  (2.12), 


'21      22 


■Au    -A12  | 

(2.17) 
B21     °22  t 

5 


and 


,  All    A12 
M   =  <  (2.18) 

An     H 


where 


H  =  A21G11A12  -  A21G12g22  -  (A21G12g22)'  +  g22G22822 
Gll  =  {A11  "  g12C22g2irl 


G12  "  A11S21G22 


(2.19) 


G22   =   {C22   -   S2lAng2irl 
Because   of    (2.7),    the  determinant   of    (2.18)    is 

|m_1|   =   |a|2/|cj  (2.20) 

and  it  is  useful  to  draw  attention  to  its  computation.   Using  partitioned 
form,  see  [6,  p.  165],   its  ingredients  have  determinants 


C|  -  |AUI|C22  -g21A-Jg12l 


a|  =  |A11||g22  -  g2iAnA12l 


(2.21) 


and  this  is  useful  in  computing  the  asymptotic  efficiency,  [12] 

Eff(9)  =  |m-1|/|a|  (2.22) 


In  the  special  case  of   p  =  2   and  q  =  1,  (2.19)  reduces  to 


H  =  T^T  U12C22  -  2A12§21g22  +  All4>}         (2"23) 
and  the  determinant  of  M    reduces  to  the  especially  convenient  form 

|M-1,  .  '"U«22  -  A12S22>2  (224) 

A11C22  •  g21 

and  the  denomination  of  (2.24)  is   |c|. 


III.   Single  Parameter  Settings 


Poisson. 


The  Poisson  random  variable  X  has  density 

f(x;A)  =  e~AAX/x!  for   x-0,1,...  (3.1) 


and  the  derivative  of  the  log  likelihood  is 


S  =  f  -1  (3.2) 


It  is  well-known  that 


A  =  E(S2)  =  j  (3.3) 


and  the  sample  mean  is  the  minimum  variance  unbiased  estimator  for  all  sample 
sizes.   It  is  also  the  maximum  likelihood  estimate  and  the  method  of  moments 
estimate. 

Let  us  look  further  solely  for  academic  purposes.   Since  the  Poisson 

has  the  property  that  its  mean  is  equal  to  its  variance  it  follows  that  the 

2 
sample  variance   s    is  also  a  "moment"  estimate  of   \.   Moreover,  the  idea 


can  be  extended.   There  are  many  other  statistics  that  can  be  equated  to 
their  expected  values  and  the  resulting  equations  solved  uniquely  for  A . 
One  other  will  be  considered  here,  namely,  the  averages  of  reciprocals  of 
the  {1  +  x.},.   The  one  is.  added  as  a  convenient  device  for  avoiding 

division  by  zero.   The  result  will  be  called  the  harmonic  mean  estimator. 

2 
Since  s   is  directly  an  estimate  of  A ,  its  asymptotic  efficiency 

is  quickly  and  easily  expressed.   Using  (2.22),  (3.3),  and  (B.5)  we  have 

Eff(s2)  ■ -  (3.4) 

A  +  2A 


The  harmonic  mean  estimator  is  characterized  by  the  equation  (see  (B.7)) 

y  -  |-  (1  -  e"X)  =  0  (3.5) 


where 


i  n    i 


Equation  (3.5)  is  in  the  form  g  =  0   (i.e.  (2.1))  and  satisfies  the  assump- 

■k 

tions .   The  resulting  estimate,  A  ,  can  be  computed  using  the  iteration 
function  that  falls  naturally  out  of  (3.5),  namely, 

\+1  '  7  (1  -  e  r)  (3.7) 


and   A  ■*■  X        for  all  initializations   A~  >  0 ,  but  convergence  can  be 
r  0  ° 

quite  slow  if  a  poor  A    is  chosen.   Then  asymptotic  variance   C   (see 
(2.6))  of  (3.5)  is  the  variance  of   (1  +  X)    which  can  be  computed  from 
(B.7)  and  (B.8).   The  quantity  A  of  (2.2)  is  obtained  by  differentiating 
(3.5)  with  respect  to  A, 


A  =  \   (1  -  e  X   -   Ae~A)  . 
A 


(3.8) 


*     2 
Using  (2.22),  (2.7)  we  have   Eff(A  )  =  A  A/C,  or  more  explicitly 


Eff(A*)  =  4r  (1  -  e~X  -  Ae  V/Var(-^) 

A 


1+X; 


(3.9) 


The  efficiences  of  the  two  estimators  are  compared  in  Figure  3.1. 
Of  course  there  is  no  point  in  using  any  estimator  other  than  A  =  x,  but 
the  graph  suggests  that  the  harmonic  mean  may  be  profitably  used  in  other 
problems  in  which  a  choice  must  be  exercised. 


Figure  3.1 
Asymptotic   Efficiency 
(  Poisson  ) 


Eff    (X*) 


18      20 


Just  for  fun,  let  us  compare  the  values  of  the  three  estimators   x, 

2   * 

s  ,  A   when  applied  to  some  famous  data.   First,  the  number  of  deaths  due 

to  mule  kick  in  200  Prussian  army  corps  years.   The  frequency  counts  are 
109,  65,  22,  3,  1  for  variable  values  0  thru  4  [7,  p.  109].   The  estimators 
are 

x  =  0.61  ,         s2  =  .6109  ,         A*  =  .6093 

and  agreement  is  rather  good.   Second  (Rutherford  Chadwick  data) ,  the 
number  of  radioactive  disintegration  in  2608  time  intervals  of  7.5  seconds 
each  [4,  p.  150].   This  time  we  have 

x  =  3.867  ,        s2  =  3.633  ,         A*  =   3.886  . 

The  sample  size  is  much  larger  but  the  value  of   A   is  in  a  range  of  lower 

2 
efficiency  for   s  .   Also,  the  radioactive  decay  is  more  properly  modeled 

2 
with  a  pure  death  process  and  this  may  help  explain  the  smaller  value  of   s 

One  final  comment.   The  convergence  of  the  iteration  function  (3.7) 

has  importance  in  finding  the  maximum  likelihood  estimate  for   A   from  a 

Poisson  population  in  which  the  zero  values  have  been  truncated.   That  is, 

a  trial  that  produces  no  counts  does  not  come  to  the  experimenter's 

attention  [10,  p.  3-13].   The  density  is 


e-A    Ax 
f  (x,A)  =  —  — r  ,         x  =  1,2,  .. 

1A   X  • 
-  e 


and  the  maximum  likelihood  equation  is 

A 


x 


1  -  e 


which,  as  a  function  of  A,  is  the  same  as  (3.5) 

10 


Symmetric  Beta 

A  Beta  random  variable  with  equal  parameter  values  has  density 

f(x,a)  =  F(2a)?  xa(l-x)a    for    0  <  x  <  1,  0  <  a         (3.10) 

[r(a)]z 

Using  the  psi  function  [1], 

♦«■>  ■ d  lndJM  <3-u> 

one  can  write  the  derivative  of  the  log  density  as 

Sa  =  2«K2a)  -  2^(a)  +  ln(x(l-x))  (3.12) 

and 

A  =  2^'(a)  -  4<j/(2a)  (3.13) 

1  rn 


by  (2.9).   Let   In  x  =  —  Y .    In  x.   and  express  the  maximum  likelihood 

n  ^i=l     l        r 

equation  as 


4Xa)  -  ip(2a)  =  |  {In  x  +  ln(l-x)  }  (3.14) 


so   a,  the  maximum  likelihood  estimate,  is  a  function  of  the  geometric 

mean  of   x  and   1-x.   Also,  it  is  difficult  to  find. 

Let  us  turn  to  the  method  of  moments.   Because  of  symmetry,  see  (E.ll), 

(E.12), 

2      1 


U  =  TT  ,  ■■   = 


2  '  4(2a  +  1) 

Thus   x   cannot  be  used  and  we  turn  to  the  second  moment.   The  form  g  =  0 
becomes 

s2  "  ,  in    \  n  "  0  (3.15) 

4(2cc  +  1) 

and  the  estimator  can  be  found  explicitly 

11 


a  =  -^  -  \  (3.16) 

8s 


and  using  (2.2),  (2.7)  produces 


n  Var(a)  =  4(2a  +  l)4  Var(s2)  (3.17) 


The  right  side  of  (3.17)  requires  that  (E.2)  be  used  with  (A. 2).   The  result 
is  not  compact  and  is  produced  only  by  computer  program. 
A  harmonic  mean  option  is  available  since 

(see  (E.13))  for  a  >  1,  and  the  variances  are  finite  if  a  >  2.  Easy 
calculations,  exploiting  (E.14)  and  (E.15),  show  that  both  y  and  z, 
where 


i  ?  i  i  ?    i 

-1-1  1        1 


should  be  used  instead  of  either  one  alone  in  order  to  reduce  variability, 
We  select  the  form  g  =  0   to  be 

(a  -  l)(y  +  z)  -  (4a  -  2)  =  0  (3.20) 

and  solve  explicitly 

«'-^H  (3.21) 

Application  of  (2.2),  (2.7),  (E.14),  and  (E.15)  produces 


n  Var(a*)  =  (2a  ~ ^   '   1)2     for  a  >  2  (3.22) 
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Figure   3.  2 
Asymptotic   Efficiency 
(  Symmetric    Beta  ) 


The  efficiencies  of  these  two  estimates  are  compared  in  Figure  3.2. 
The  second  moment  estimator,  a,  is  uniformly  better  than  the  harmonic  mean 
estimator,  a  ,  and  this  result  needs  interpretation  in  the  light  of  the 
success  in  the  Poisson  case.   This  time  the  averages  of   X    and   (1-X) 
were  used  instead  of   (1+X)   ,  the  latter  being  difficult  to  manage  with 
the  beta  distribution,  and  the  parameter  must  be  at  least  two  in  order 
for  the  variance  to  exist.   Also  the  positive  sample  space  is   0  <  x  <  1, 
which  entails  large  and  variable  values  for  the   {x.  },  and  this  may  explain 
the  lack  of  success.   The  estimator  a   may  still  have  some  uses  because 
(unlike  a)  we  have  a  simple  explicit  expression  for  its  variance  (3.22) 
and  its  efficiency  is  competitive  for  a  more  than  (say)  four  or  five. 
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IV.   Two  Parameter  Settings 
Gamma 
The  gamma  random  variable   X  has  density 


f(x;a,S)  =  — xa   1  e   x/B  (4.1) 

T(a)6a 


and  the  partial  derivatives  of  its  logarithm  are 


c     x    a 

b6     2  "  6 


(4.2) 


S   =  In  x  -  In  3  -  iKa) 

a 


where  (J)  (a)   is  the  psi  function  (derivative  of  log  gamma).   The  maximum 

likelihood  estimators  (a, 3)   is  found  from  (4.2)  by  replacing  x  with  x 

1  rn 


J.  rll 

and   In  x  with   In  x  =  —  } .  In  x..   Thus  it  fits  into  our  generalized 

n  Ll  l  ° 

moment  scheme  utilizing  the  arithmetic  and  geometric  means.   To  solve  the 
system,  one  sets  3  =  x/a   into  the  second  member  of  (4.2)  and  search  for 


the  root  of   In  a  -  \\j  (a)  =  In  x  -  In  x,  which  requires  a  psi  function 
capability  [1,3].   This  is  not  difficult  with  a  large  computer,  but  may 
be  a  challenge  for  a  small  one,  e.g.  a  hand  held  calculator.   Viable 
alternatives  are  available  using  the  ordinary  method  of  moments  and  a 
generalization  that  exploits  the  harmonic  mean. 
The  information  matrix  (2.9)  is 

a/32      1/3 


j  a/3       1/3   ) 

A  =  (4.3) 

(  1/3       i^'(a)  ) 


whose  inverse  is 
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aip  '  (a  )-l  J 


B%'(a) 


(A. 4) 


where  primes  denote  derivative. 

-        2  2 

The  ordinary  method  of  moments  equates   x  and   s    to  otB   and  aB  , 

respectively,  and  the  resulting  estimator  is 

B  =  s2/x  ,  a  =  x2/s2  (4.5) 


Using  9   =  B   and  8?  =  a   in  order  to  conform  with  (4.2),  one  can  apply 


(2.2)  to  obtain 


(a      &        \ 

A  =  /  (4.6) 

(  2aB     B   ) 


Note:   Although  the  method  of  moments  shares  an  equation  with  maximum  likeli- 

_   2 
hood   (x/B   -  a/B  =0),  there  is  no  advantage  to  using  this,  through  (2.18), 

in  this  case. 

The  matrix  C   is  obtained  from  (C.5),  (C.6),  and  (C.7).   Then  (2.7) 

can  be  applied  to  get 

£_  2a  +  3         _  , 
a   2(a+l) 

M  =  2(a+l)    <  (4.7) 


and 


|m|  =  2(a+l)B2  (4.8) 


Turning  to  another  choice,  let  us  recognize  that  the  ordinary  method 
of  moments  uses  moments  of  order  one  and  two,  while  maximum  likelihood 
uses  moments  of  order  one  and  zero.   Heuristically  one  might  find  advantage 
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in  moments  of  order  one  and  minus  one.   Consider  for  the  system  (2.2) 
(see  (C.2)  and  (C.8)), 


x  -  ct8  =0 


y  -  s^iy =  ° 


(4.9) 


where 


1  rn 


e  y  =  —  )  .  1/x.   and  the  consequent  estimators 
'   n  u  1    i 


*  —  1  *        XV 

8   =  x  -  -  a   =  — 5* (4.10) 

y  xy  -  1 

which  satisfies   3   >  0  and  a   >  0   since  the  arithmetic  mean  is  larger 
than  the  harmonic  mean.   Proceeding  as  before  we  obtain,  for  a  >  2, 


* 

A  = 


r     *  ) 

{   8  (a-1)     S(a-l)Z) 


(4.11) 


and 


2   2q  -  3 
2(a-l)2 

M*  =  2(a-\}     \  (4.12) 

a  -  2 


and 


1 


2 

|M*|  =  2g2  (a"1)0  for  a  >  2  (4.13) 

a  -  2 


The  asymptotic  efficiencies  of  the  estimators  (4.5)  and  (4.10)  are 
computed  from  (2.22)  using 

|A|  =  \   (c^'(a)  -  1)  (4.14) 

These  efficiencies  do  not  depend  on  the  scale  parameter,  8,  and  can  be 
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plotted  as  functions  of  a ,  as  is  done  in  Figure  4.1.   Also,  the  former 
appears  in  [15].   The  alternative   (a  ,3  )  catches  up  to  (a, 8)   at  a  =  3 
and  remains  better  thereafter.   The  relative  efficiency  of  the  latter  with 
respect  to  the  former  is  (see  (4.13)  and  (4.8)) 


M 


(q-D 


M|     (a-2)(a+l) 


(4.15) 


It  is  less  than  100%  for  all  a  >  3  and  falls  below  90%  for   4  <  a  <   7 


Figure  4.1 
Asymptotic  Efficiency 
(  Gamma  ) 


Eff  (oc,  /3    ) 


/-n-*       r+*j 


Eff  (oc,  /3  ) 


FIGURE  4.1.   Asymptotic  Efficiency  (Gamma) 
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As  a  numerical  example,  the  three  estimators  were  applied  to  135 
observations  on  the  service  time  of  the  check-in  queue  at  a  major  hotel. 
The  gamma  distribution  fits  quite  acceptably  according  to  chi  square 
criteria.   The  results  are: 

B  =   .303         B  =   .305         B*  =   .298 
a  =  6.33  a  =  6.31  a*  =  6.45 

Counterexample :   This  opportunity  is  taken  to  show  that  the  representation 
(2.18),  which  is  the  main  theorem  of  [12]  ,  is  false  when  one  drops  the 
assumption  that  the  subsystems  u  =  0   (2.11)  are  likelihood  equations. 
We  use  the  system 

1 


y   B(a-l) 


=  0 


2     2 
s   -  aB   =  0 


which  has  an  equation  in  common  with  the  method  of  moments  (4.5),  but  this 
common  equation  is  not  part  of  the  maximum  likelihood  system.  This  system 
has  explicit  solutions 


•if  i   n 


:  i  j    i     /  i  .,  2  ~*     .      i 

■x    \ +/— r+4s   ;  ,         a-  =  1  +  — — 

2  )   v  J       2       (  ■ 


|     y     1  /  j' 

One  readily  develops,  using  (4.11)  and  (4.6) 


A   = 


62(a-l)      S(a-l)2 


2a  B 


I 


and  from  (C.7),  (C.9),  (C.ll) 
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2(a-l)2  (a-2) 


C   = 


2aB4(a+3) 


Using  M   ■  A'C  A,  we  calculate 


2a  a-2 


"*-l 

M 


;2(a-l)2(,-2)   B2(a+3)     B«-«   6(a+3) 


a-2   +   1  a-2 


B(a-l)     6(a+3)       (a-1)2    2a(a+3) 
and  this,  clearly,  has  no  submatrix  in  common  with  (upon  inverting  (4.7)) 


^       6      B"  2a  +  3  ( 


a  2(a+l) 

Negative  Binomial 

The  negative  binomial  density  is  parametrized 

f(x;r,p)  =  ^^y  qXpr        for  x  =  0,1,2,...       (4.16) 

0  <r,  0  <p  <1,  p+q  =  1 

and  the  partial  derivatives  of  its  logarithm  are 


Sp  =  r/p  -  x/q 


S  =  iKr+x)  ~  <Kr)  +  In  p 


(4.17) 


Using  the  basic  recursive  formula  for  the  psi  function  [1], 


L9 


,(r+x)  -  ^(r)  =  I 
j-1 


one  may  express  the  system  of  maximum  likelihood  equations  as 

x  -  rq/p  =  0 

(4.18) 
x. 
x 


ave      I . 

1=1,. ..,n  j=l 


+  In  p  =  0 


Solving  (4.18)  is  quite  difficult  to  manage  without  a  large  computer. 
Appendix  F  contains  an  APL  program  (PSIB)  which  accomplishes  it. 

The  information  matrix  can  be  developed  using  (2.9).   The  result  is, 

=  p   and   8   =  r, 

r/qp       -1/p 


using 


j  r/qp       -1/p  j 

A  =  (  (4.19) 

(  "1/P        A   1 


'22 


where 


h22   =   i|;'(r)  -  E(^»(r+X))  (4.20) 

The  properties  of  A__   are  developed  in  Appendix  D  (see  (D.22)  and  (D.26)), 
along  with  the  series  representation  of  the  determinant  (D.25), 


00    n        i 

2  in+1   (n+r)!  ^'Zi; 

p   n=l 


which  converges  rapidly  for  most  values  of   p.   For  purposes  of  computation, 
one  may  as  well  use  (4.21)  in  (4.19)  and  solve  for  A99,  thus 

A22  -^{1  +  P2|A|}  (4.23) 
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To  estimate  by  the  ordinary  method  of  moments,  we  use  (D.5)  and  form 
the  system  (2.1)  to  be 


px  -  rq  =  0 

(4.24) 


2  2 
p  s   -  rq  =  0 


The  explicit  solution  is  readily  seen  to  be 


-2 

P  =  -^  ,  r  =   2  X  _         (4.25) 

s  s"  -  X 


and  it  cannot  be  guaranteed  that  p  <  1  and  r  >  0.   Differentiating  (4.24) 
and  taking  expectations  yields  the  matrix 


r/q 

rO+gl 
P 


Combining  (2.6),  (4.24)  and  (D.6)  produces 


(1  1  +  q 

=  rq 


C  =  rq  (4.27) 


(  1  +  q    1  +  2(r+2)q  + 


and  finally  the  formula  (2.7) 


:(r+l)    ) 

\         pq  r    ; 


M   =    2U2X/ 

rq 


and 


Ml    =   2(r+l)p2/q  (4.29) 
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Let  us  turn  to  the  question  of  using  a  different  moment  in  the  second 
equation.   Since  the  negative  binomial  has  positive  probability  mass  at 
zero,  we  use  averages  of  the   (1+x.)    as  was  done  when  the  Poisson  case 

was  considered.   Perhaps  that  level  of  success  can  be  matched. 

L  r  n 
Letting  y  =  —  1 1  l/(l+x.),  use  for  the  system  (2.1) 


px  -  rq  =  0 

(4.30) 

(r-l)qy  -  p  +  pr  =  o 

because  of  (D.7).   The  system  (4.30)  cannot  be  solved  explicitly,  but  it 

can  be  managed  with  a  hand  held  calculator.   Use  the  first  member  to  obtain  t 

as  a  function  of   p  and  its  derivative 

r-^,  ^  =  -\  (4.31) 

q  dp    q2 

and  substitute  into  the  second  member  of  (4.30)  to  obtain  a  function   f 
of  the  form 

f(p)  =  P   +  (r-l)qy  -  p 

=  pr  -  y  +  p(y  -  1  +  xy)  (4.32) 


and  having  derivatives 


r- 


df    f  -i  ■  -  \    P  x(q  +  In  p) 

—  =  (y  -  1  +  xy)  +  ^ K- 

q 


The  solution  of   f(p)  =  0   can  be  obtained  by  Newton's  method,  always 
remembering  to  update   r   as  well  as   p.   The  initialization  p  =  .5   and 
r  =  x  appears  to  be  satisfactory,  but  normally  convergence  would  be  faster 
if  the  moment  estimators  (A. 25)  are  used. 
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The  resulting  estimator  will  be  denoted  by  p  ,  r  .   From  (4.32)  it 
is  seen  that 

f (0)  =  -y  <  0  ,  f(l)  =  xy  >  0 

so  that  0  <  p   <  1,  and   r   >  0   follows  from  this  by  (4.31). 

Let  us  turn  to  the  development  of  the  asymptotic  covariance   structure 
of   p  ,  r  .  Taking  partial  derivatives  and  the  expectation  of  (4.32)  to 
produce  the  coefficient  matrix  A  of  (2.2)  yields 

r/p  -q 


I 


A*  4.3 


r 

V  rp 


1   1  -  p        p  -  p     r  . 

-  c—  x !r 1-  p   In  p 

q  r  -  1 


with  the  help  of  (D.7).   Attention  is  drawn  to  the  fact  that 


limit  P  ~  P  =  p  -  p  In  p  (4.34) 

r  -*■   1 


The  covariance  matrix  C  of  (2.6)  is  derived  from  (4.30)  with  the  help  of 
(D.5),  (D.7),  and  (A. 5).   The  result  is 

I  rq  rqpr  -  p(l-pr)   \ 

C*  =|  (4.34) 

I      r       '1    r\  l         1\2  2  T7    /  1  x  I 

Vrqp  -  pU-p  )      (r-1)  q  Var(^^)/ 

Let  us  draw  attention  to  the  fact  the  first  equation  of  (4.17)  is  a  multiple 
of  the  first  equation  of  (4.32)  rather  than  being  identical.   This  is  the 
reason  that  (4.33)  and  (4.35)  are  modifications  of  (2.17)  and  (2.16)  rather 
than  exact  analogies.   Thus  the  bookkeeping  that  follows  must  be  done 
carefully. 
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The  direct  development  of  the  matrix  M   from  (2.7)  with  (4.34)  and 
(4.35)  as  input  is  messy.   Instead  let  us  recognize  that  (4.30)  shares  an 
equation  with  the  likelihood  system  (4.18)  and  use  (2.18).   Thus,  with  A 
given  by  (4.19),  we  have 

r/qp 

*-l 

M  (4.36 

-1/p      M2 
where  M    is  obtained  from  (2.23)  and   |C|   from  (2.16).   Thus 


M22 

= 

1 
!C| 

J    r 

(qp 

g22  + 

2 
P 

|0| 

= 

r 
2 

qp 

C22   " 

2 
821 

+  C22) 
2( 

P  '  (4.37) 


and   (g?1 ,  g99)   is  the  second  row  of   A   in  (4.33)  and   C0O   is  taken  from 
(4.35).   Using  (2.24)  we  obtain 


and  (4.36) 

|c|  =  \   (r-l)2q  Var^)  -  g^ 


The  asymptotic  efficiencies  of   (p,r)   and  (p  ,r  )  appear  in  Tables 
4.1  and  4.2,  resp.,  for   p  =  .1(.1).9  and   r  =  .5(.5)5,  6(1)19   where  the 
parentheses  indicate  the  indices  of  advancement.   The  efficiency  of   (p,r) 
is  monotone  increasing  in  r   for  each   p.   It  is  lower  for  the  smaller 
values  of   p.   The  efficiency  of   (p  ,r  )   is  high  for  the  smaller  values 
of   r  and  decreases  (generally).   It  is  not  monotone  for  p  =  .1,  .2. 
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The  relative  efficiency  of  p  ,  r   with  respect  to  p,  r,  i.e., 

Rel  eff  =  |M  |_1« |M| 

appears  in  Table  4.3.   Generally  p  ,  r   is  preferable  for   r   less  than 
or  equal  to  (say)  2.5.  In  general   p,  r   is  preferable  elsewhere,  but  it 
does  not  matter  much  for  small  values  of   p. 

The  three  estimation  schemes  were  applied  to  the  Cricket  score  data 
of  Reep,  Pollard  and  Benjamin  [13],  which  provided  the  following  comparisons 


Cowdry         Barrington        Graveney 

x  1.692  2.095  1.570 

s2  4.343 


mi 

igton 

2 

.095 

4 

.939 

.538 

116 

.345 

1 

.014 

(1+x)"1  .603  .538 

n  156.  116.  107 

p  .329  .345  .317 

r  .831  1.014  .729 

p  .390  .424  .351 

r  1.080  1.543  .849 

p*  .371  .326  .389 

r*  1.000  1.012  1.000 
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Beta 

The  beta  density  has  the  form 


^^-hStt?)*""1^6"1  <«■"> 


for  0  <  x  <  1,  0<ct,  0< 
and  the  partial  derivatives  of  its  logarithm  are 


S  =  ijj(a+S)  -  ^ (a)  +  In  x 
a 

S     =   .]j(a+3)  -  ip(3)  +  ln(l-x) 


The  information  matrix  is,  for   6  =  a,  9 


(4.38) 


(  ^'  (a)  -  ip'(a+6)  -i/j'  (ch-6)    | 

A  -  )  (4.39) 

(     -  -KafB)         ip'(B)  -  ^'(ch-6)  ) 


The  system  of  maximum  likelihood  equations 


In  x  =  iKa)  -  ^(crt-S) 


ln(l-x)  =  ^(3)  -  ip(a+6) 


(4.40) 


uses  the  geometric  means  of  x  and   1-x,  and  is  difficult  to  solve.   What 
other  pairs  of  statistics  might  be  substituted? 


Clearly   x  and   (1-x)   cannot  be  paired  since  they  are  functionally 
related.   The  latter  is  merely   1-x.   Let  us  first  develop  the  ordinary 
method  of  moments.   Using  (E.4)  and  (E.5),  choose  the  systems 

(a+3)x  -  a  =  0 


(4.41) 


(a+3)2  (a+3+l)s2  -  a3  =  0 


and  solve  explicitly  for 

29 


2   -   -  2 

_  s   -  x(l-x)  "  _  s   -  x(l-x) 

a  = ,  3  =  3 (4.42) 

1-x  x 


It  may  occur  that  a  <  0,  8  <  0. 

The  coefficient  matrix  A  of  (2.2)  is 


W-^rTTTT  (*•«*) 


(a+B)3(a+0+l)  Cov(x,s2) 


(a+8)3(a+B+l)  Cov(x,s2)       (a+B)  (a+6+1)2  var(s2) 


1 


(4.45) 


and  the  use  of  (E.2)  thru  (A.l)  and  (A. 2)  does  not  appear  to  simplify  in 
any  useful  way.   Appendix  F  contains  programs  to  compute   |MJ  =  |c|/|a|  . 
Let  us  turn  to  the  pair  of  statistics 


,  n  n 

y  =  ~  I  —  z   =  -  V  ^^-  (4.46) 

J        n  t*  x.  n  4  i-x. 

li  1    i 


which  are  not  functionally  related.   Using  (E.8)  we  may  choose  the  system 

(a-l)y  -  (a+B-1)  =  0 

(4.47) 
(8-1) z  -  (a+8-1)  =  0 

for  a  >  1,   8  >  1.   The  solution  is 
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*      v(z-l)  *      z(y-l) 

a   =  " L-  6   =  y  (4.48) 

yz  -  y  -  z  yz  -  y  -  z    v    ; 


Clearly  y  >  1  and   z  >  1  but  the  denominators  are  not  necessarily  positive 
since   zy-y-z+l=  (y-l)(z-l)   can  be  less  than  one. 
For  the  system  (4.47)  the  coefficient  matrix  becomes 


-1 

a 
1-1 

and  with  the  help  of  (E.9)  and  (E.10)  one  can  calculate 

-1 


(4.49) 


,  a-2 
C*  ■■ 


5-2 


for  a  >  2,  B  >  2.   Use  of  (4.49)  and  (4.50)  in  (2.7)  does  not  produce  a  con- 
venient expression  for  M  .   However  its  determinant  is  easily  managed. 

For  fun,  let  us  also  try  a  system  based  on  moments  of  order  one  and 
minus  one.   (This  would  seem  to  straddle  the  two  geometric  means  appearing 
in  maximum  likelihood.)   We  are  lead  to 


(a+8)x  -  a  =  0 
(a-l)y  -  (a+6-1)  =  0 


(4.51) 


and  the  resulting  estimate 


;*-lZ=12l  ~*=  (y-l)d-x)    (4>52) 

yx-1  yx-1 


J] 


and  this  satisfies  a      >  0,  3   >  0.   (But  do  not  forget  that  use  of  (E.6) 
in  (4.51)  requires   a  >  1.)   Proceeding  in  the  usual  way,  we  calculate 


a  +  8      a  + 

A*  =      6 

a  -  1 


aB 


I  a  +  6  +  1 

C*  =  { 

R  B (a+g+1) 

"6  a  -  2 


Again  M   does  not  have  a  convenient  form.   The  determinates  of  (4.52)  and 


(4.53)  are  expressed 


A   = 


Cl  -  2 


(a+3)(a-l) 

(4.55) 

9 


a  -  2 


for   a  >  2. 

The  efficiencies  of  (4.42),  (4.48),  and  (4.52)  are  compared  in  the 
tables  that  follow.   Table  4.4  contains  the  efficiency  of  the  ordinary  moment 
estimator.   Efficiency  is  high  if  a   is  not  too  far  from  8   and  both  are 
at  least  two.  Elsewhere  they  are  low,  but  still  the  choice  because  all  the 
numbers  in  Table  4.4  are  better  than  their  competitors  in  Tables  4.5  and  4.6. 
The  pair   (a  ,8  )   is  generally  better  than   (a  ,8  )   but  not  uniformly  so. 
It  matters  little  since   (a, 8)   is  the  "hands  down"  winner.   This  result 
parallels  what  was  learned  in  the  symmetric  beta  case.   The  variable  X 
is  unstable  in  this  population. 
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TABLE   4.4.      Eff(a,B)      BETA  POPULATION 


.5  1.0  1.5  2.0  2.5  3.0  3.5 


.5 

0\  4  93 

0.  554 

0c537 

0.507 

0  .  477 

0.451 

0.  429 

1.0 

0.554 

0.713 

0.747 

0.739 

0.719 

0  .696 

0.674 

1.5 

0.  537 

0.  747 

0.820 

0.  839 

0.  835 

0.822 

0  .806 

2.0 

0.507 

0.  739 

0.839 

0.  878 

0.389 

0.887 

0  .378 

2.5 

0.  477 

0.  719 

0.835 

0.  889 

C.912 

C.919 

0  .  918 

3.0 

0.451 

0.  696 

0  .822 

0.  867 

0.919 

0  .934 

0.938 

3.5 

0.  429 

0  .674 

0.  806 

C.  878 

0  .91  8 

0  .93  3 

0  .948 

4.0 

0.  411 

0.  653 

0.790 

0.  867 

0.912 

0  .938 

0.  952 

4.5 

0.  395 

0.  635 

0.774 

0.855 

0.  904 

0.  933 

0  .  951 

5.0 

0.  382 

0.  618 

0.758 

0  .843 

0.895 

0.  92  S 

0.  948 

5.5 

0.  370 

0.  504 

0.  744 

0.  831 

0.8&6 

0.921 

0.944 

6.0 

0.  360 

0.591 

0.731 

0.  61  9 

0.  876 

0.914 

0  .938 

6.5 

0.  351 

0.  579 

0.719 

0.809 

0.867 

0.906 

0  .  333 

7.0 

0.  344 

0.568 

0.709 

0.799 

0.858 

0  .899 

0.927 

4.0               4.5  5.0  5.5  6.0  6.5  7.0 

#5  0.411  0.395  0.382  0.370  0.360  0.351  0.344 

1^0  0.653  0.635  0.618  0.604  0.591  0.579  0.568 

1.5  0.790  0.774  0.758  0.744  0.731  0.719  0.709 

2.0  0.867  0.855  0.843  0.831  0.819  0.809  0.799 

2.5  0.912  0.904  0.895  0.886  0.876  0.867  0.858 

3.0  0.938  0.933  0.928  0.921  0.914  0.906  0.899 

3.5  0.952  0.951  0.943  0.944  0.938  0.933  0.927 

4.0  0.959  0.961  C.961  0.958  0.955  0.951  0.946 

4.5  0.961  C.966  0.968  0.968  0.966  0.963  0.960 

5.0  0.961  0.968  0.972  0.973  0.973  0.972  0.969 

5.5  0.958  0.966  0.973  0.976  0.977  0.977  0.976 

6.0  0.955  0.966  0.973  0.977  0.980  0.980  0.980 

6.5  0.951  0.963  0.972  0.977  0.980  0.982  0.983 

7.0  0.946  0.960  0.969  0.976  0.980  0.983  0.934 
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TABLE  4.6.   Eff(a  ,8  )   BETA  POPULATION 


1.0  1.5  2.0  2.5  3.0  3.5 


2.5 

0.11  9 

0.  202 

0.  258 

0  .  297 

0.326 

0.347 

0.364 

3.0 

0  .163 

0  .  276 

0.  353 

0.407 

0  .  447 

0.477 

0.500 

3.5 

0.1  84 

0.313 

0.  400 

0.461 

0  .  5  06 

0.540 

0.566 

4.0 

0.  196 

0.  333 

0.426 

0.492 

0.540 

0.  576 

0  .  605 

4.5 

0.  204 

0.  346 

0.  U43 

0.511 

0.561 

0.599 

0.629 

5.0 

0  .  209 

0.  355 

0.  454 

0.  524 

0.576 

0  .61  5 

0  .  645 

5.5 

0.  212 

0.  361 

0.  462 

0.534 

0.586 

0.  626 

0.  657 

6.0 

0.215 

0.  366 

0.  468 

0.5  41 

0.59H 

0.  634 

0.  666 

6.5 

0.  217 

0.  369 

0.  473 

0.546 

0.  600 

0.641 

0  .673 

7.0 

0.  21  8 

0.372 

0.476 

0.550 

0.604 

0.  645 

0.  678 

4-0  4.5  5.0  5.5  6.0  6.5  7.C 

2.5  0.378  0.389  0.398  0.406  0.412  0.418  0.423 

3.0  0.519  0.534  0.546  0.557  0.566  0.574  0.581 

3.5  0.588  0.605  0.619  0.632  0.642  C.651  0.659 

4.0  0.627  0.646  0.662  0.675  0.686  0.695  0.704 

4.5  0.653  0.672  0.688  0.702  0.714  0.724  0.733 

5.0  0.670  0.690  0.707  0.721  0.733  0.743  0.752 

5.5  0.682  0.703  0.720  0.734  0.746  0.757  0.766 

6.0  0.691  0.712  0.730  0.744  0.757  0.768  0.777 

6.5  0.698  0.719  0.737  0.752  0.765  0.776  0.785 

7.0  0.704  0.725  0.743  0.758  0.771  0.782  0.792 
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APPENDIX  A 

General  Variance  and  Covariance  Formulae 
Connecting  the  Mean,  Variance,  Harmonic  Mean 

Let  X1 , . . . ,X   be  a  random  sample  of  size  n  and  denote  the  sample 
mean  with 


i  n 

=  M  x.  , 


the  sample  variance  with 


s2   .   _i_  I      a±.-x)2    , 


the  inverse  of  the  harmonic  mean  with 


i  n   i 

y  =  -  y  — 

n  k  X  ' 

1   i 


and  the  inverse  of  the  shifted  harmonic  mean  with 

i  n    i 

Y'  =M  — ^-  . 

n  <i  1+X. 

1  l 

2  2 

It  is  well  known  that   E(X)  =  y,  E(s  )  =  a   when  the  population  mean,  \i , 

2  r  r 

and  variance,  a  ,  exist.   Let  m  =  E(X  )   and  \i      =   E(X-p)  .   The  follow- 


ing relationships  are  needed. 


-   2    1  3 

Cov(X,s  )  =  —  {m_  -  3num..  +  2m.} 
n   J     l   1     1 


-  U,  (A.l) 

n  J 
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Var(s  )  =  -  {m  -  4m  m.  +  3m„  -  40   +  -==-  o  }      (A.  2) 
n   4     j  1     I  n-1 

1  i  4    2   4, 

-  —  {p.  -  o  +  — -  a    } 
n   4        n-1 

2  2           2 

1  2       ?    yA  "  y9  (y,-2y0)    y   -  3y0 

Var(i  I    (X.-X)2)  -  -i-^  -  2  --4-J-2-  +  -*-5-£        (A. 3) 

1  n         n 


Cov(X,Y)  -  i  {1  -  E(X)  E(l/X)}  <  0  (A. 4) 


Cov(X,Y')  =  -  {1  -  [1  +  E(X)]  E(-r^-)}  <  0  (A. 5) 


Cov(s2,Y)  =  -  ^  {m2E(l/X)  +  m  -    2m^E(l/X)}  (A. 6) 

1   I  2       \ 

= {um-um..+vi) 

n   2  -1      -1 


Cov(s2,Y')  =  -  1  {[y2  -  (1+y)2]  E(^)+  1  +  y}        (A. 7) 


The  proofs  of  (A.l)  to  (A. 7)  follow.   It  is  convenient  to  record 
the  relationships 

2 

m2  =  M2 

3 

m„  =  y_  +  3y2M  +  y  (A. 8) 

2    4 
m^  =  y^  +  4y„y  +  6y2y  +  y 

To  enhance  the  readability,  the  symbols   E,  V,  C  will  be  used  to  denote 
expectation,  variance,  and  covariance  (resp.).   Parentheses  and  subscripts 
will  be  used  sparingly. 
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Proof  of  (A. 4) .   Consider 


EXY  =  4r  ELX.K1/X.) 

*•  1     J 

n  J 


-^2  {n  +  n(n-l)  EXE(1/X)} 


from  which  we  subtract  EXE(1/X)   to  produce  (A. 4).   The  fact  that  (A. 4)  is 

negative  follows  from  the  presumption  that  X  >  0  and  the  consequence  that 

the  harmonic  mean  is  less  than  the  arithmetic  mean.   Thus   (ave  X.)  x  (ave  ■=—) 

l         X. 

l 

unless  (all  x.  =  constant)  and  apply  the  law  of  large  numbers. 


Proof  of  (A. 5) .   Follows  from  the  fact  that  C(l  +  X)Y'  =  CXY'   and  the 


application  of  (A. 4). 


Proof  of  (A. 6) .   Consider 


n(n-l)  Es2Y  =  E  I   tt-  Z  (X.-X)2 
X. 

1     J 

1     2-2 
=  E  Z  f-   (EX^-nXZ) 

i    -1 

X2         X.X. 

=  E£E  —J-  -  —  ZZE  -i-1 
X.   n     X,k 


=  nEX  +  n(n-l)  EX2E  ^  -  EX  -  2(n-l)  EX 

-  (n-l)  EX2E  ^  -  (n-1) (n-2)  E2XE  ^ 
X  X 


and  divide  through  by  (n-l)  to  get 
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nEs2Y  =  -EX  +  (n-1)  EX2E  ^  -  (n-2)  E2XE  ~ 

A  A 


=  -mi  +  (n-l)m2E  |  -  (n-2)m2E  | 


2   1 
Then  subtract   n(m  -ro-)E  —  to  obtain  the  first  version  of  (A. 6).   The 

second  version  follows  from  (A. 8). 


o 
Proof  of  (A. 7) .   Follows  from  (A. 6)  because   s   and   u   are  invariant  under 

trans lations. 


Proof  of  (A.l).   Let  us  work  with 


EXs2  =   ,  ^   E{IX2EX.  -  -  (EX.)3} 
n(n-l)      l   j    n    l 


{nEX3  +  n(n-l)EX2EX  -  EX3  "  3(n-l)EX2EX-  (n-1) (n-2) E3X} 


n(n-l) 


-  {EX3  +  (n-3)EX2EX  -  (n-2)E3X> 
n 


1  3 

=  —  {m0  +  (n-3)m„m.  -  (n-2)m, } 
n    J         /  1         l 


3 
and  subtract  mm  -  m  .   The  second  version  follows  from  (A. 8) 


Proof  of  (A. 2) .   Let  us  begin  with 


(n-l)2Es4  =  E(ZX2)2  -  -  EEX2(EX.)2  +  4r  E(EX.)4  , 
l     n     l  2      l 

n 


and  treat  the  three  main  ingredients  separately. 


E(ZX2)2   =  nEX4  +  n(n-l)E2X2  (A. 9) 


0  0/  OO  OO  "3 

EX. (EX.)      =  nEX     +  n(n-l)E  X     +  n(n-l) (n-2)EX  EX  +   2n(n-l)EX  EX        (A. 10) 


E(EX.)4  =  nEX4  +  4n(n-l)EX3EX  +   3n(n-l)E2X2  +  6n(n-l) (n-2)EX2E2X 

+  n(n-l)(n-2)(n-3)E4X  (A. 11) 


The  proper  combination  of  (A. 9),  (A. 10),  and  (A. 11)  produces 


Es4  =  -  {EX4  -  4EX3EX 
n 

+  ^r  [(n2-2+3)E2X2-2(n-2)(n-3)EX2E2X+  (n-2)  (n-3)E4x]  } 
n— 1 


4 
The  subtraction  of      a        in  the   form 


4  2   2  2   2  4 

a      =   EX     -   2EX  E  X  +  EX 


yields 


2  4  3  22  424 

nVs      =   EX     -   4 EX  EX  +  3E  X     -   4o     +  — -  a 

n-1 


and  this  is  (A. 2).   The  second  version  follows  from  using  (A. 8)  and  modifying. 

Proof  of  (A. 3) .   The  variance  of  this  form  of  the  sample  variance  can  be 
developed  from  (A. 2)  using  (A. 8).   It  also  appears  in  [7,  p.  183]. 
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APPENDIX  B 
Moments  of  the  Poisson 
The  Poisson  random  variable  X  had  density 

f(x;A)  =  e"A  AX/xI  ,  x  =  0,1,2,...  (B.l) 


2 
and  it  is  well  known  that  \i    =  X,  a   =  X.   The  probability  generating  function 


is 


G(u)  =  E(uX)  =  e"A(1_u)  (B.2) 


and  the  moment  generating  function  can  be  obtained  from  (B.2)  by  the  replace- 
ment  e    for  u.   Moments  can  be  obtained  by  repeated  differentiation. 
We  record 

E(X)  =  X 

2     2 
E(X  )  =  X   +  X 

E(X3)  =  A  +  3A2  +  A3  (E-3) 

h  2      3     3 

.  .  E(X  )  =  X  +  1\      +   6X"  +  XJ 

Use  of  (B.3)  into  (A. 2)  produces 

U3  =  X  +  3X2  (B.A) 


Var(s2)  =  -  {2X2  +  X}  +  o(-)  (B.5) 

n  n 


Cov(X,s2)  =  -  X  (B.6) 

n 
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The  moments  of   (1+X)     can  be  obtained  by  integrating  the 


generating  function  (B.2) 


Ei-r^r)    =  E  /   uXdu  =  /   G(u)  du  =  e~A/   eAu  du  -  f  (l-e~A)   (B.7) 
i+X      0        0  0  A 


E(T^7>2  =  E  J   /  "XvX  du  dv  =  //  G(uv)  du  dv  =  e  A  //  eXuV  du  dv 
1+X       0  0 

-X  1  Xu  -       -A  °°  1  .  j  i-1 

-  V  /   V1  du  -  V  II   ^rr-  du 

x    o  u               x    l  o       2  ■ 

oo  j  —  2. 

=  e"A   I  Wr                                    (B.8) 

-i  J  J  • 


This  opportunity  is  taken  to  record  an  alternative  way  of  obtaining 
moments,  (B.2),  which  in  this  case  is  somewhat  easier  than  the  differentiation  i 

of  the  moment  generating  function.   One  begins  with  the  generating  function 

X 
G(u)  =  E(u  )   and  replaces  the  argument   u  by  a  product  of  dummy  variable 

uv  • • •  z   containing  as  many  factors  as  the  order  of  the  moment  to  be  calcu- 
lated.  Then  one  takes  a  partial  derivative  with  respect  to  each  variable 

u,  v,  etc.  and  the  desired  moment  is  obtained  when  each  variable  is  set  to 

2 
unity.   For  example,  E(X  )   can  be  obtained  from 


2  2 

5   =  -r\-   G(uv)  =  T-~-  E(uv)X  =  E(X2uX"1vX"1)     (B.9) 
uv   3u  9v  8u  3v 


The  four  moments  (B.3)  can  be  obtained  in  this  way  replacing  u 
with  uvwz   in  (B.2).   The  resulting  derivatives  are 
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G  =  AG 
u 

G   =  AG[1+  Auvwz] 

2  2 

G    =  AG{z[l  +  Auvwz]  +  Auvwz  [1  +  Auvwz]  +  Auvwz  ) 
uvw 

G     =   Auvw  G    +  AG{[1  +  Auvwz]  +  zAuvw  +  2zAuvw(l+Auvwz) 
uvwz  uvw      L 

,22222        _    ,         , 
+   A    u  v  w   z      4-  ZzAuvwi 
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APPENDIX  C 


Moments  of  the  Gamma 


The  gamma  random  variable  X  has  density 


f(x;a,6)  -  — xa  Xe  x/8  (C.l) 

r(a)Ba 


and  it  is  well-known  that 


M    =   ag  a   =  a6  (C.2) 


Direct  integration  produces  the  formula,  for  r  >  0 


r(a) 

and  for  r  <  a 


E(Xr)  =  6r  *£&-  (C.3) 


3 
Use  of  (C.3)  in  (A.l)  and  (A. 2)  produces 


E(x"r)  ^rgr1  (c-4) 


Var(X)  =  -  a62  (C.5) 

n 


Cov(X,s2)  =  -  aB3  (C.6) 


Var(s2)  =  2a64  (-\  +  -)  (C.7) 

n-1   n 


Using  (C.4)  one  readily  calculates 
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E(i)=B^iy  i<«  <c-8> 


Var(^)  =  — \ 2  <  a  (C.9) 

B  (a-l)^(a-2) 


and  letting  Y  =  —  \      (1/X.),  we  find  that,  with  the  aid  of  (A. 4), 


Cov(X,Y)  =  ^^y  !  <  a  (CIO) 


and  with  the  aid  of  (A. 6) 


Cov(s2,Y)  =  0  (C.ll) 


Because  of  the  formula 


T'(y)  =  /   ln(y)  y""1  e  y  dy  (C.12) 

0 


one  can  develop,  for  r  >  -  a 


E(Xr   In  X)    =   3r  Fr(Ty    (In  8   +  ^(a+r))  (C.13) 


since     I"  (a)    =  r(a)    i>(a) 
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APPENDIX  D 


Moments  of  the  Negative  Binomial 


The  density  has  the  form 


ct  ^    r(r+x)   x  r       r-  _  n  /„  -.», 

f(x;,r,p)_  =  —r— r~r-  q  p        for  x  =  0,l,...  (D.l) 


xlr(r) 


Its  probability  generating  function 


0<r,0<p<l,    p+q   =   1 


X, 


G(u)    =  E(u   )    =  —2 (D.2) 

d-qu)r 


will  be  exploited  broadly.   Successive  derivatives  of  (D.2)  evaluated  at 
u  =  1  produce  the  factorial  moments 


m(s)  =  EX(X-l)  •••  (X-s+1)  =  (q/p)S  r(r+l)  •••  (r+s-1)       (D.3) 


to  which  one  may  apply  some  orderly  substitution  and  obtain  the  first  four 
moments,  using  A  =  q/p 

EX  =  Ar 

EX2  =  A2r(r+1)  +  Ar 

(D.4) 

EX3  =  A3r(r+l)(r+2)  +  3A2r(r+l)  +  Ar 

EX4  =  A4r(r+l)(r+2) (r+3)  +  5A3r (r+1) (r+2)  +  4A2r(r+l)  +  Ar 

The  mean  and  variance  of  the  population  are 
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y  =  rq/p 


2      .  2 
a   =  rq/p 


(D.5) 


Use  of  (D.4)  in  (A.l)  and  (A. 2)  provides  the  covariance  matrix  for 
the  ordinary  method  of  moments 


2  2    2 

n  Var(X)  =  A  r  +  Ar  =  rq/p  =  o 


n  Cov(X,s  ) 


=  2A3r  +  3A2r  +  Ar  =  a2  ^ 

P 


n  Var(s2)  =  2A  r(r+3)  +  4A3r(r+3)  +  A2r(2r+7)  +  Ar 
=  o2[2(r+3)q+p2]/p2 


(D.6) 


The  harmonic  mean  alternative  requires 


Ml+X;    (r-l)q 


(D.7) 


r    1 


(^)2  =  (^J0  utT^i-1]^ 


(D.8) 


for   r  4-   1.   The  case   r  =  1   is  treated  in  [12,  Appendix  A].   Expressions 
(D.7)  and  (D.8)  are  justified  next  along  with  computational  formulae  for 
(D.8)  when   r   is  either  a  whole  number  or  a  whole  number  plus  0.5. 


Proof  of  (D. 7) .   The  generating  function  (D.2)  is  integrated, 

n=l 


e(^v-)  =  J^CiOdu  =  p'/1  -4s 


^P_ 


•1+X' 


o 


0       (1-qu)2  (r-l)(-q)(l-qu)r    X 


=      P-P 

'    (r-l)q 


n=0 


as   required. 
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Proof  of  (D. 8) .   Replace  u  by  uv  and  integrate  twice. 


E( 


^)2  =  /r(UV)d"dV°H<r-C(1,q:v)r-l 


u=l 
u=0 


r    1 


7-hr  I  ~  r — L— r  -  A  du 

(r-l)q  J  u|_(1_qu)r-l    J 


as  required. 

Computational  formulae  require  managing  integrals  of  the  type 


-j1!"— — ^ 

0  uu(l-qu)      J 


(D.9) 


Clearly  J  =  0.   It  is  useful  to  apply  the  partial  fraction  representation 
repeatedly. 


A— + L 


u(l-qu)       (1-qu)     u(l-qu) 


(D.10) 


Since 


}     q  ^     =  jl  r_i_  _    I 

0   (1-qu)         <-p       J 


we  have,  from  (D.10) 


r  -J  .  +_L.r_JL.il 

r    r-1   r-1  (_  r-1 


Let  <r>  be  the  greatest  integer  in  r  and  apply  the  above  formula  to  get 
the  representation 

<r> 


3  .yj^ri     !]„ 

r   j=i  r"J  Lpr"J    J    r-<r> 


(D.ll) 


which  is  valid  for   r  >  <r>. 
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Let   r  = < r>.   One  can  show  directly  from  (D.9)  that 


1        o      (1-qu) 


-  In  p 


(D.12) 


and    then 


-£*&*-']- 


i     r-!   - 

I   r  ~r_J      r- 


lnp-~     I     t   [Pr   J~Pr]    "Inp      (D.13) 


Pr  3=1  J 


2      r 

Accordingly  it  is  recognized  from  (D.8)  and  (D.9)  that   E(l/1+X)  =  [pr/ (r-l)q] J 
which,  together  with  (D.7)  and  (D.13)  enables 


Var 


(^>  =  c^j£i[pr~V]-prinp-^! 


(D.14) 


for   r  an  integer    2   (empty  sum  is  zero),  and  for   r  =  1, 


Var(^r)  =  *{  I     4-£  dnp)2) 


•1+X'    q   .  ,  .2 

J-l  J 


(D.15) 


This  latter  formula  is  developed  in  [12],  see  (A. 3)  and  (A. 5). 


Let   r  =  <  r>  +  .5.   The  exploitation  of  (D.ll)  requires  dealing  with 


:-<r>   Jl/2   £ 


1  r 


P\ u 

I-  u^l-qu       J 


du 


Making  the  change   qu  =  sin"  9    provides  for  manageable  integrals  of 
trigonometric  functions.   See  [14,  p.  316].   The  result  is 


Jl/2  =  2  ln 


1  +  /p 


(D.16) 
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and 


E(iV  =  t^h  {X^  [pV]  +  p"  J^} 


(D.17) 


which  is  valid  for  <r>  ^  1.   For   r  =  1/2  one  needs,  from  (D.9), 


J-l/2  =  I 
x/     0 


/l-qu    1 


u     u 


du 


=  2  /l= 


qu 


u=l 


u=0 


+ 


0   '-uv'l-qu       J 


=  2/p  -  2  +  J 


1/2 


(D.18) 


using  formula  [14,  p. 316].   From  (D.8) 


E(lfe)2  -  -  f 'P  f*7-  2  +  J1/2>  -f  {^-  P-  'F  m  ^^} 


for  r  =  1/2. 


(D.19) 


The  information  matrix  (4.19)  contains  a  difficult  element  A   > 
(4.20).   It  may  be  managed  using  an  integral  representation  of  the  tri- 
gamma  function  [1,  p.  259], 

1 


In  u  r-1 


i  t  /  x      f   xn  u  i 
i|>'(r)  =  -  J   — —  u 

0 


(D.20) 


It  follows  that 


22 


=  i^'(r)    -  E{<f,'(r+X)} 

f      ln  u    r    r_1        vr    r-1+XM    a 
=   -   J      ■= [u  -   E(u  )  ]    du 

0 

=  -    f1  ^  u^1  l"l ^—  r   " 

0     X"u  L  (i-qu)r       J 


(D.21) 


50 


This  relationship  can  be  expressed  better  if  we  make  the  change  of  variable 

w  =  pu/(l-qu) 
and  manipulate  to  obtain 

"  J  ^(Py  wr-l  dw  (D.22) 


1 
A 

0 


There  is  advantage  in  using  (D.22)  in  the  development  of  the  determinant  of 
A,  (4.19), 

l  a  l  -    r   f   ln(p+qw)   r-1  ,     1 

|A| 2  /     1  -  w  W    dw  -~2 

qp   0  p 

1   {-l/1  ln(p+qw)  d(wr   _  1} 
p2     q  0  x   "  w 

--i-  J1  wr  d(4^3±wl)  (D.23) 

2   n  1  -  W 

qp    0 


using  partial  integration  and 


oo   n 


ln(p+qw)  =  _  y   q_  a_l/j)n~1 


1  -  w       i   n 


In  this  form  it  is  easily  checked  that 


, .  •   i.i  1   r1  ,/ln(p+qw),  -ln(p)-^ 

limit  A  =  — r-  J  d(  .  ^  ^ — )  dw  =  "^T^ 

11  Z  '  1  —  w  z 

r  -*■  0  qp   0  qp 


limit  | A |  =  0 


r 


For  computational  purposes  one  can  exploit  the  expansion 


(D.24) 
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oo   -n 


iSiE+gw!)  =  J  3_(n-l)(l-w)n-2dw 
1  -  w      Ln    n 


1=2 


which,  when  used  in  (D.23),  produces 


1 
0 


»  i     1  r   n  n-l  f    *  /->      \ft-2  , 
A  |  =  — j  i  1  "V"  '   W  (    ) 


1   J  qn^.r!(n-l).V 
p2  J    n+1  (n+r)  » 


_  V  -9-  r-  n 


2  ^n  n+1   (n+r)! 
p  n=l 


Similar  efforts  applied  to  (D.22)  can  produce 


n 


A22=  L^-^tyf1 


n=l  n 
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APPENDIX  E 
Moments  of  the  Beta  Distribution 

The  Beta  random  variable  X  has  density 

nx'a,S)   r(a)  r(s)    (  x) 

for  0  £  x  £  1,  0  <  a,  0  <  6       (E.l) 
and 

oo 

T(a)  =  J   x   e   dx  . 
0 

This  is  called  the  B(a,B)   distribution  and  it  is  useful  to  note  that  1-X 
has  a   B(3,a)   distribution. 

Moments  are  obtained  directly  by  manipulating  gamma  and  beta  functions 
For  r  >  0 

WYr^    r(q-ffi)  r(q+r)  _  (ct+g-l).'  (ct+r-1)  I  .        . 

MA  ;   T(a)  r(a+6+r)  '  (a-1)  !  (a+3  +  r-1)  !  ^t,z; 


f  r     s   ..  r(a-H3)  T  (a+r)  T  (S+s)  ,        . 

elx  (l-x;    j  -r(a)  r(g)  r(a46+r+s)  tE.J; 


and  the  mean  and  variance  follow 

_  _o (E.4) 

U   a  +  6 

2        a6 


o   = 


(a+6)2  (a 46+D 


(E.5) 


Moments  of  negative  order  exist  if  a   is  large  enough.   If 
a  >  r,  6  >  s   then 
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-r  r(g-fg)    r(a-r) 

E(x     )       r(o)   r(a-HS-r)  (E'5) 


Fry"rWl-Yrs   -     r(a+6)        r(a-r)    r(B-s)  ,        . 

E(X      )(1   X)         -   r(a)    r(6)    r(a+6_r_s)  (E.7) 


and    the  harmonic  mean  option  will  be   available, 


B<!>-l+£  1<«  (E.8) 


Var(I)    =     g^-1)  2   <   a  (E.9) 


(a-1)2    (a-: 


and  if     a   >    1,    6   >    1 


^f-i^'-c^Hh)  <E-10> 


Symmetric  Beta 

Set   a  =  8   and  obtain 


P  =  1/2  (E.ll) 


a2  =  4(2k)-  (E'12) 

E(|)  =  ^Zl  !  <  a  (E.13) 


Var(|)  =  *<*?» 2  <  a  (E.14) 

(a-l)Z  (a-2) 


Cov(x'  i^x}  = "  "^^i  x  <  a  (E-15) 

(a-1) 
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APPENDIX  F 


APL  PROGRAMS 


The  function  HARP  performs  the  iteration   (3.7)  to  estimate  the 
Poisson  parameter  from  the  harmonic  mean.   The  left  argument  X   is  the 
initial  value  (usually  x)  and  the  right  argument  is   y   of  (3.6).   The 
function  HMV  computes  the  variance  of   y  using  (B.7)  and  (B.8).   The 
right  argument  is  the  set  of  parameter  values,  A. 


V  L+X    HARP    Y\LL 

[1]  L+X 
[2]  LlzLL+L 

[3]  L«-(l-*-L)  rY 

[4]  L 

[5]  -+L1*\  (  \L-LL  )>0.  0001 

V 


V  V-HMV   L\B\D\M\U 

[  1  ]  iV«-  5  0 

[2]  V+(L**1)*.*\N 

[3  ]  D+{  (M<-1  tp  V)  ,;V)p(  *-\N)*(  \N)  **  \N 

[4]  B+$(N ,M)p*-L 

[5]  V++/B*V*D 

[5]  V+V*L 

[7]  V+-V-{  (l-*-L)*L)*2 


The    function  MONT  produces      E(X   )      for    the   symmetric  beta   distri- 
bution using    (E.2)    with     a    =  6.      The   left   argument   is    the    (integral)    order 
of    the   moment   and    the   right   argument   is    the   parameter.      The    function  VAR 
computes    the   variance   of    the   estimator,    a   of    (3.16)    using    (3.17)    and 

(A. 2).      Again   the   argument    is    the   parameter 
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V    Z+R   MONT    A;  N',I 

Ci]       l^o 

[2]  ZHN  tN+pA)pl 

[3]       Ll:J«-I  +  l 

[4]  Z«-Z*($(NtN)pA+I-l  )*Ao.+A+I-l 

[5]         -+Llx\I<R 


[1]  tM  4    MONT    A)-(ttx(3    A/0^2*   A)x(l    MW   A))-(3x(2    A/OAT   4  )  *2  )  -  M  (  4+8x4 

[2]  K<-4x  Kx(l  +  2x4  )*4 

V 


The  polygamma   functions   are   computed   using   PSI   and   JEX.      When   the 
(scalar)    left   argument,    N,    is    zero    the  psi    function   is   produced.      Integral 
values   of      N     index   the   order   of    the   derivative   of   psi.      The   argument   of 
the   function  appears   on   the   right.      The   technique   comes   from   the   asymptotic 
expansions   in  Abramowitz. 


1] 
2] 
3] 

4] 
5] 
6] 
7] 
8] 
9] 


V    P+-N   PSI    YiCiIV;JIViK',KK;YYiV;Z;TiI 


n    N    IS    THE    ORDER   OF   THE   DERIVATIVE   OF   THE    PSI    FUNCTION 

ft    Y    (>0)    IS    THE  ARGUMENT .SCALAR   OR    VECTOR 

C+1Q 

IV+\pY+-  ,1 

P+Z+K+-(pY)pO 

KK+-[C-YLJIV+-(V«-Y<C)/IV] 

-+L1XV  (pIV)=p(~V)/IV 

T+YWIV1 

I  +  Q 

10]    L2:I+I+1 
11]       YY+KKillpTHl 

12]       Z[I>(.'AO*+/(  (  YY-1  )  +  \KKLI^)*-li-N 
13]       -+L2x\I<pJIV 
14]       Z+V\ZiipJIV] 
15]      K+V\KK 
16]   Ll:+Slx\N>0 
17]       F«--(a£  +  y)-(  2xA:+Y)*-l 
18]      -+2+J26 

19]    51:  ?^(  (  .'iV-l)x(  Y+K)*-N)+(  !tf  )x0.5x(  Y+K)  *-N  +  l 
20]       P«-(  (     l)*N+l)*P+Z+N   JEX    Y  +  K 
V 


56 


V  J+N   JEX    Y\C\M\F\E\A 

[1]  A    USED    IN    THE    POLYGAMMA    FUNCTIONS 

[2]  C--CH  5,(7*5)f(5*7),( 11*25), (5* 455*11x691), (6  91*7x^55)) 

[3]  C«-(  (pY^.Y)  ,6)pC 

[4]  M^7 

[5]  F«-(  y*2)o.x(2+2MM-l)x(lt2xiW-l)?(//+2MM-l)x(//+l  +  2MW-l) 

[6]  £+-CxF 

[7]  AH  7t6  )x(y*-iV+2x,V)x(  :"l  +  A7+2x^)  *<  »  2*1-1) 

[8]  J^x  !  +  £"[;  6  ]xl+£-[  ;5]xl+ff[;u]xi+ff[;3]xi+E[;2]xl+5[;l] 


The   efficiency   of    the   usual  moment   estimator    (4.5)    of   gamma   distri- 
bution parameters   is   produced   by   EFF   using    (4.8)    and    (4.14).      The   efficiency 
of    (4.10)    is    computed  by   EFFH  using    (4.13)    and    (4.14),    and    the   relative 
efficiency    (4.15)    is    the    function  REFF.      Only    the   parameter      a      is   needed 
and   it   is   entered   on   the   right    for   all   three.      It   must   be      >    2    for    the 
latter   two   functions. 


V    E-EFF    Y 
[1]  2>(2x( y+i )x( (Yx(i     PSI    Y))-l))*-l 

7 


V    E+-EFFH    Y 
[1]  E+-2H  Y-2  )*(  Y-l  )*2 

[2]         E*-Ex(Y*l    PSI    Y)-l 
[3]         E+E*-l 

V 


V    Z+REFF   A 
[1]  Z+{A-l)x(A-l)HA  +  l)*A-2 

V 
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We   turn  now   to    the   programs   that   support    the   negative  binomial 
distribution.      The   function     MM     computes    the   ordinary  method  of  moments 

estimators      p    ,    r      of    (4.25).      The    left   argument   is      x     and   the   right 

,.    .  2 

argument   is      s    . 

V  M+X  MM    S 

[1]         P*-XiS 

[2]  R+(X*2)iS-X 

V 

The   determinant   of    the  asymptotic   covariance  matrix  of      p,    r      is   produced 

(see    (4.29))    by  DM,   whose   left   and   right   arguments   are      r     and      p. 

? 
The   function  PSIB   computes      x    (=  XB) ,    s      (=   S) ,    y    (=  Y) ,    the 

estimators      p,    r      (using     MM),    and   the  maximum  likelihood   estimate     p,    r 

by   applying   Newton's   method   to    (4.18)    using     p,    f      for   initialization. 

The   left   argument,    F,    is    the  vector   of   observed   frequencies    corresponding 

to    the   right   argument      J,    the  vector   of  variate  values. 


V  Z<-F    PSIB    J  \PH\PKD\  Z\ZZ 

[I]  XB+(+/J*F)i+/F 

[2]  S-{  *(+/F)-l)x(  +/F*J*2)-(+/F)*XB*2 

[3]  Y«-(+/FtJ+1  )i  +  /F 

[4]  XB   MM    S 

[5]  R,P 

[6]  LliPP+F 

[7]  Z-+/(0    PSI    /?W)xFt+/F 

[8]  ZZ«-  +  /(1     PSI   R+J)*Fi+/F 

[9]  Ftf<-Z-(0    PSI    R)-(9R)-oR+XB 

[10]  PHD+ZZ-(1     PSI    R)-(rR)-iR+XB 

[II]  R+R-PH*PHD 
[12]  F+RiR+XB 
[13]  R,P 

[14]  -+Z/1X1  (  |  PP-P)  >0.  0  0  01 
V 
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The   function  INF1   computes    the   difficult   element   of    the   information  matrix, 
^22     °f    (^.20),    using    (D.26).      The   left   and   right   arguments   are      r      and      p, 
resp.      The  vector      r     must   be  whole   numbers   except   it  will   also   handle    the 
values    .5,    1.5,    ...    ,    4.5. 


V    Z+R   INF1    PiN  ;A  iBiC  i  ZZiJ  iVifll  ;  NK 
[  1  3         A r«-  \  M 
[2]         C+(l-P+-,P)o  ,*N 

[3  ]         A  +  {  (pAT)  tpR+-,R)pQ 
[4]  J*-Q 

[5]  L1:JW  +  1 

[6]  V+(N  +  R [</.])  >55 

[7]  -+L2x\{pN)=pNN4-(~V)/N 

[8]  ZZ*-(.pNl*-V/N)pO 

[9]         AlNliJ  }  +  x/(  ZZ°.+ii?[J]-l  )  T/7lo.  +  li?[e/].1 

[10]       ->L  3  x  i  (  p  NN  )  =  0 

[11]  L2:AINNW  X  .'  //iY )  *(  »/?[J  ]  -1  )  v  •  NK+RU  ]  -1 

[12]  L3:+Llxlc/<ptf 

[13]       B-«-$(  (pi?)  tpN)pN*2 
[14]       Z«-C+.x,US 
V 


The  function  DETI  and  DI  both  compute  the  determinant  of  the  information 
matrix,  but  the  former  uses  INF1  in  (4.19)  and  can  handle  the  same  set 
of   r  values  which  are  whole  numbers  plus  .5.   The  latter,  DI,  uses  (D.25) 
and  all  values  of   r  must  be  whole  numbers. 


V  V+R    DETI    P 

[1]  V-(R    INF1    P)x«?(i?+-,i?)».v(P*2)xi-P 

[2]  V<-V-<${(pR)  tpP)pP*-2 

V 
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7  D+R  DI   P\A\B\C\N\Pl\J  \Z 

[I]  N+\M 

[2]  CHl'P-,P)°.*N 

[3]  Pl+-fi>(  (pR+tR)  ,pP)pP*-2 

[4]  4«-(  (ptf)  ,pP«-,P)pO 

[5]  Z^(ptf)pO 

C6]  J*-Q 

[7]      Ll:J"W+i 

C8]  AliJl+x/(  Z°  .+\RU1)*N°  .  +  \RU1 

[9]  -»-Llxit7<pi? 

[10]  B+$({pR)  ,QN)pN+l 

[II]  Z?«-Pl*Z?-t-C  +  .  *A\B 
V 


The  determinant  (4.29)  of  the  asymptotic  covariance  matrix  (4.28) 
is  computed  by  the  function  DM 


7  D-R   DM    P 
[1]    0«-2x(l+/?4-,/?)o.x(p*2  )U-P«-,P 
7 

and  the  efficiency  of  the  moment  estimator  is  NBEF 

7  E+-R    NBEF    P 
[1]    £>*(/?  DM    p)*$r   DETI   P 
7 


The  function  HAR  implements  the  iteration  scheme  described  in 
(4.32)  and  produces  the  harmonic  mean  based  alternative  estimators  p  ,  r 
from  the  system  (4.30).   The  left  argument  is   x  and  the  right  argument 

is  y . 

7    P+X    HAR    I\PP\C;G\GP\Z 
[1]         P+Q+0.5 

[2]         C+Y-1-YxX 
[3]  R+X 

[4]       LliPP+P 
[5]         G+(Z+P*R)-X-CxP 
[6]  GP+C  +  Z*X*(Q+<}P)iQ*2 

[7]         P+P-GiGP 
[8]  R+-X*PiQ*-l-P 

[9]  P.P.G 

[10]       +L1XI ( | C)>1E"6 
7 
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[1 J 

C2] 

[3] 

[4] 

[5] 

[6] 

[7] 

[a] 

[9] 

[10] 

[11  ] 

[12] 

[13  ] 

[14] 

[15] 

[16] 

[17] 

[18] 

[19] 

[20] 

[21  ] 

[22] 

[23] 

[24] 

[2  5] 

[26] 

[27] 

[2b] 

[29] 

[30] 

[31  ] 

[32] 

[33] 

[34] 

[35] 

[36  ] 

[37] 

The   Var(l/1+X)      is   computed  by   VHM1   using    (D.14)    and    (D.15)    for      r     values 
that  are  whole   numbers,    and   using    (D.17)    and    (D.19)    for      r     values    that 
are  half   way   between  whole   numbers. 


V    Z+R    VHM1    P\AA\BB\B\I\PP\RR\S\H\RH\HHH\M\U1\U2\J\N\11N\Q\SS 

fl     VAR   OF    *{1+X)    FOR    NEG    BIN(RtP)iR    MUST    BE    WHOLE   OR    WHOLE    FLUS    .5 

RH~U 2«-/?  =  0.  5)/P<-(~£/l+-P  =  l)/P 

Z  «-£«-<  (RR+pR+-  ,R)  ,PP«-pP«-,P)pO 

-*L3*i  (  pP)=0 

I«-0 
LI:  I +1+1 

B+RUl-1 

->(  3  +  J26  )  M/J*LB 

B«-B-l 

SHi  >-»P 

-►(  2+126  )*\B  =  IB 

SlI; ]«-2x»2*l  +  P*0. 5 

BB«-(PP,LB)pP[I]-l  +  iLB 

AA-(P°  .  *-/?[I]-l+iLB)-l 

SUi  1+SUi  1+  +  /AASBB 

+Ll*\I<LRR<-pR 
L3:R+-(~U2)\R 

+Llx;  (  pP)  =  0 

RlU2/\ RR-RR++/ U2]+G .5 

S«-(~t/2)\[l]    S 

-►(  2+l26)*i  Q  =  +  /U2 

SLU2/\RR't  ]-*-+(  2xa2Tl  +  P*0.5)  +  2x(p*0.5)-l 

Z«-5*  (#«-$?<>.  */?)  *HHH+(R-1  )»  .  xi-p 

AM {HH+(RR,PP)pP)-H) iHHH 

Z-Z-M*2 
Lt:JlM  2x950iPxi000)Tae*-l-P 

A'[(  7^/;<50)/  i  PPJ««-  5  0 

SS+PPp  0 

</«-0 
L2  :  J-*-J+  1 

SS«-+/(  Q°  .  *\NN)H  PP,NN)p(  \NNf[NUl)  *2 

•+L  2  *  i  J  <  PP 

-K  1+126  )x(  +/[/l  )  *0 

tf«-(~tfl)\i? 

RLUl/x  RR+-RR  +  +  /  Ul  ]«-l 

Z«-(~£/l)\[l  ]    z 

Z[//l/iPP;  ]^px(52-Px(  (aP)*2)^)i5 
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The   function  DMSI   computes    the   determinant   of     M  from   (4.36).      Two 

auxiliary   functions   are   needed:      G22   provides      g  and      g   _      from  the 

second   row  of      A        in    (4.33)    (compare    (2.17))    and   DMSI   is   needed   to   handle 
values   of      r  =   1,    special   handling  being  required  becuase   of    (4.34). 


V  X+R  DMSI   P;RRiPPiHHiCiLiG',G21iZiU 

[1]  AV(  (Pi?)  ,pP)po 

[2]  R+{~U+R  =  1)/R 

[3]  Z+-(L*-R*  .*(P*2)*Q4-1-P)*G+$R    (72  2    P 

[*+]  ZH  Z+((*G21  )iHH+(  (RR+pR) tPP+pP)pP)*2 

[5]  Z«-Zt(LxC^(  ((p-i  )°  ,xQ)*2)*R    VHM1    P)-<S?G21*2 

[6]  Xl(~V)/  \RR++/U;  >Z 

[7]  -►(2  +  l2  6)xlo=+/tf 

[8]  XlU/\RR++/Uil+DMSl    P 


V    Z+DMSl    P;W 
Cl]  Z«-(  (0.  5x(  ©P)*2)+(/^l-P-<»p)*2 

C2]         Z«-Z*(((pP)pi     VRM1    P)xC(l-P)*3))-(^xp)*2 


V  Z^P    G22     PiPP;RR;T;TT;H;HH-tV;W 

Cl  ]  #<K  (PP^pP«-,P)  ,PP^pP«-,P)pPo  .  *p 

[2]  HH^(RR,PP)pP 

C3]  Z+HH-H 

[5]  TM-yj/iPP 

[6]  T«-H*§(RRtPP)p<*P 

[7]  Z[;2*2']^[;r2']  +  Z[;72']*(pZ[;2,2,])p(-7)/i?-l 

[8]  G2lH(PP,RR)pR)*HiHH 

[9]  C21«-G21-(l-ff)T$(/?/?fPP)pi-p 
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The  efficiency  of  the  harmonic  mean  based  estimator   p  ,  r   is  computed 
by  the  function  EFF 


V  E+-R    EFF    P 
[1]    E*-{R    DMSI    P)i*)R    DETI    P 
V 

Both  EFi-  and  NBEF  used  DETI  and,  for  that  reason,  are  restricted  to  only 

a  few  fractional  values  of  r.   The  relative  efficiency  of  p  ,  r   with 

respect  to   p,  r   is  not  so  restricted.   It  is  computed  by  RELEF  and  accepts 
any  r  >  0. 

V  E+R    RELEF    P 

[1]    E+((R   DM    F)*(R   DMSI    P)) 
V 

Methods  for  computing  the  efficiency  of  beta  distribution  estimators 
are  supported  by  the  following  programs:   The  determinant  of  the  information 
matrix  (4.39)  is  computed  by  the  functions  DINF  and  DDINF.   The  former  takes 
a  single  (vector)  argument  and  produces  a  symmetric,  square  matrix  of 
values   |a|    for  all  pairs  of  components  of  the  arguments.   If  the  arguments 
a   and   8   must  be  entered  separately,  then  the  two  argument  DDINF  can  be 
used. 

V  L+-DINF   A'tN;B;C 
[1]    L+-  ,L*-A°  .  +A 

[2]    B-(N  ,N*-pA)  pi     PS  I   L 

[3]    L  +  (C°  .xC)-(C°  .+C+-1     FSI   A)*B 

V 


V  L+A    DDINF    B\M\N  \CA\CB\D 

[1]  L  «- ,  L  •*-/!  °  .  +  S 

[2]  0«-(  (M+qA  )  ,iV«-pfl)pl  PSI    L 

[3]  L+-(CA°  .*CB  )-(  (CA+1     PSI    A)*.+CB+1     PSI    3)*D 

V 
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The  function  DETMM  accepts  two  by  two  matrices  A   (on  the  left) 
and   C   (on  the  right)  and  computed  the  determinant   |M|  =  |c|/|a| 
of  (2.7).   The  function  ADET  uses  it  to  produce  an  array  of  such 
determinants.   The  arguments  H  and  C   are  four-dimensional  and  may  be 
thought  of  as  an  M  by  N  array  of  2  by  2  matrices.   The  left  set,  H, 
are  the  coefficient  arrays  A  of  (2.2)  and  the  right  set,  C,  are  the 
covariances  (2.6) 


V   M*-A    DETMM    C 

[1]         AMGU)  +  .x4£i4 

C2]  «<-(WCl;l]x1V[2;2])-W[l;2]*2 

V 


V   MM+B  ADET   C\N\I\J\M 
[1]         MM  (M«-lt~2  +  p#)  ,tf«-"l+p#)pO 
[2]  I-J  +  Q 

[3]      Z,1:J>I+1 

[5]  L2-.J+J  +  1 

[6]    MMtliJl+HliiIiJ']    DETMM    C[;;J;J] 

[7]    +L2*\J<N 

[  8  ]    -»L  1  x  1 1  <M 

V 


The  computation  of  the  efficiency  of  the  ordinary  moment  estimatoi 
(4.42)  requires  the  coefficient  array  (4.43)  and  the  convariance  array 
(4.45).   The  former  is  computed  by  the  function  COEFM  and  the  latter  by 
COVM.   Each  takes  a  single  vector  argument  and  computes  the  required 
values  for  all  pairs  of  components  in  the  argument.   The  function  COVM 
requires  the  beta  distribution  moments  (E.2)  and  these  are  computed  by 
MONT. 
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V  H+COEFM   A\N\B 

[1]  H+{ 2, 2,NtN-pA  )pl 

[2]  Hlliln]+(N,N)p-A 

[3]  Hll't2ii]+$(NtN)pA 

[U]  £«-(/!  °  ,  xyl  )x(4o.  +,4  )*(i4o.+4+l  ) 

[5]  ff[2;l;  ;  ]<-fl  +  M°.  -4  )x(//,tf  )P4 

[6]  #[2;  2;  ;  ]«-£+(  (-/I  )o.  +/1  )x$(  A' ,  #  )pyl 

[7]  H-H*i2,2tNtN)pA°.4-A 


V  Z«-/?    A/O/Vr    .4;  A';  J 

Cl]  1-0 

[2]  Z^-(ff,/^pyl)pl 
[3]       £1:I«-I+1 

[H]  Z«-Zx($(  A'.A'Jpd+J-l  )*4«.  +.4+1-1 

[  5  ]  -»£  1  x  x  J  <  fl 

V 


All    these   are   utilized   by   EFBM  which  computes    the   efficiency    (2.22). 

The   output   is   a   symmetric  matrix.      The   argument   must   have   positive   components 


V  E+-EFBM  A\L  \C  ;H\M 

[1]  L+DINF   A 

[2]  Ci-COVM  A  . 

[3]  H+-C0EFM   A 

C  u  ]  #«-#  4  z?£:n  c 

[5]  E+*M*L 


The  efficiency  of  (4.48)  is  handled  in  similar  fashion.   (Also 
with  single  arguments.)   An  array  of  2  by  2  matrices  (4.49)  is  produced 
by  the  function  COEF  and  the  matching  matrices  (4.50)  by  COV.   These  are 
used  by  EFBH  to  compute  a  symmetric  matrix  of  efficiencies.   The  argument 
must  have  all  components  >  2. 
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V    H-COEF   A  \N 
[1]  M2,2,iV,/IN-p>l)p-l 

[2]         ff[2;  2;  ;]«M°.M-1 
[3]         ff[l;l;  ;>$ff[2;2;  ;] 

V 


V    C+COV  A;N 
[1]  £«-(  2,  2,N,N+pA  )pi 


[2]  C[l;l 

[3]  C[l;2 

[4]  C[2;2 

V 


]«-(-4°  .  +X-1  )*<Wl  »  .  M-2 
]"K7[2;  1;  ;>M°.M-1 
>U°.M-1  )*U°.  M-2  ) 


V    E+EFBH   A\C\E\L 
[1]         L+DINF   A 
[2]         H+-COEF    A 
[3]         C^COV   /I 
[4]         M+-tf   4D£T    C 
[5]         E«-iM*L 


The   estimator    (4.52)    is  managed   in   like   fashion,    only    this    time 
the   arguments    a   (left)    and    8    (right)    must  be   entered   separately  with 
a    >  2,     S    >  0.      The   coefficients    (4.53)    are   computed  by   the   function  COEFH 
and    the    covariances    (4.54)    by  COVH.      These  are   drawn  on  by   EFBMH   to   produce 
the   efficienices 


[1] 
[2] 

[3] 
CO 
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